Abstract 

Writing achievement is a complex skill set as characterized by the sociocognitive writing framework, 
including writing domain knowledge (e.g., sentence structure), general cognitive skills (e.g., critical 
thinking) and intra- (e.g., interest) and interpersonal (e.g., collaboration) subfactors. During students’ 
postsecondary careers, they need to write in different genres. Yet we have limited understanding 
of the contribution of genre mastery to students’ writing achievement, which can affect their 
broader success (e.g., GPA). Partnering with six diverse 4-year universities, we collected student 
responses to a standardized writing assessment and authentic course writing assignments, which were 
coded for genre as standardized, persuasive, informative/exploratory, or reflective. Using automated writing 
evaluation, we extracted approximately 50 linguistic features (e.g., vocabulary usage) from the 1,426 
writing samples. We present findings for genre-based feature distributions, cross-genre correlations, 
and implications for postsecondary writing education. 


Keywords: natural language processing, writing analytics, higher education 


1 INTRODUCTION 


Writing achievement is a complex skill set as characterized by the sociocognitive writing model 
(Flower, 1994; Hayes, 2012). The model considers multiple subfactors, including writing domain 
knowledge (e.g., sentence structure), general cognitive skills (e.g., critical thinking), and intra- (e.g., 
interest) and interpersonal (e.g., collaboration) subfactors. Postsecondary writing achievement 
studies are needed to critically examine how students apply and develop their writing domain 
knowledge in different genres, since writing achievement may affect broader success measures, e.g., 
GPA (Burstein, McCaffrey, Beigman Klebanov, Ling, & Holtzman, 2019). Such studies have typically 
examined the expository essay genre (Allen, Snow, Crossley, Jackson, & McNamara, 2014; 
MacArthur, Traga Philippakos, May, & Compello, 2019). Burstein et al. (2019) used a standardized 
writing assessment and coursework writing to examine relationships between automated writing 
evaluation (AWE) features and academic success measures (e.g., GPA); however, genre was not studied. 
Our study compares writing subconstruct features in student writing, as captured by state-of-the-art 
AWE technology (withheld for anonymity), across genres. 

Figure 1: Standardized writing (blue) has lower sentence variety values than reflective (green), 
persuasive (red), or informative/exploratory (black) writing. 


2 METHODS 
2.1 Data 


At six diverse, 4-year partner universities, 735 students participated. We collected 1,426 writing 
samples. A subset of students completed a timed, standardized writing assessment requiring an 
argumentative essay (n=366). A partially overlapping subset of students (n=435) submitted 
coursework writing (n=1,060) from one course in which their instructor had agreed to participate in 
the study. Courses were primarily first-year English courses, but also included Biology, Business, 
Exercise Science, History, and Sociology courses. Data are available here: 
https://github.com/EducationalTestingService/ies-writing-achievement-study-data. 


2.2 Genre Annotation 


Three research assistants annotated the writing samples with four genre labels. All timed, 
standardized writing assessment responses were labeled as “standardized” (S) and coursework 
writing was coded as one of “persuasive” (P) (33%), “informative/exploratory” (IE) (47%), “reflective” 
(R) (14%), or “other” (5%), using an annotation protocol developed for the study. “Other” 
assignments did not align with the three coursework genres and are not included in this discussion. 


2.3 Data Analysis, Results & Discussion 


Using AWE, we extracted about 50 linguistic features from the standardized and coursework writing 
samples. The feature set represented six writing subconstructs: vocabulary usage, argumentation, 
organization & development, English conventions, sentence structure, and personal reflection. 


Feature Density & Genre. Using visual comparisons of smoothed density plots and the 
Kolmogorov-Smirnov test for differences in the distributions for the different subconstructs, we observed 
statistically significant (p<0.001), genre-based differences in AWE feature distributions. For instance, 
more pronouns (i.e., personal reflection) were observed in reflective writing than in standardized, 
persuasive, or informative/exploratory writing. Analyses suggested that standardized writing (a) 
contained less sentence variety (i.e., sentence structure) than the coursework genres (e.g., Figure 1), 
(b) used less sophisticated vocabulary (i.e., vocabulary usage) than the coursework genres, and (c) 
tended to discuss one longer topic (i.e., development), whereas the coursework genres contained more 
topic variety. 
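
As an illustration of the distribution comparison described above, the following is a minimal sketch of a two-sample Kolmogorov-Smirnov test written with the Python standard library only. It is not the study's analysis code: the function names `ks_statistic` and `ks_pvalue` are hypothetical, the feature values are made up, and the p-value uses the large-sample (asymptotic) Kolmogorov approximation, which is an assumption since the text does not state how p-values were computed. In practice a library routine such as `scipy.stats.ks_2samp` would likely be used instead.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value equal to x (handles ties).
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:  # series converges slowly for tiny lam, where p is effectively 1
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Made-up sentence variety values for two genres (illustration only).
standardized = [0.21, 0.25, 0.30, 0.28, 0.22, 0.27, 0.24, 0.26]
reflective = [0.35, 0.41, 0.38, 0.44, 0.29, 0.40, 0.37, 0.43]

d = ks_statistic(standardized, reflective)
p = ks_pvalue(d, len(standardized), len(reflective))
print(f"D = {d:.3f}, asymptotic p = {p:.4f}")
```

The statistic D is the largest vertical distance between the two empirical distribution functions, so a value near 1 indicates that the two genres' feature values barely overlap. The asymptotic p-value is only a rough guide for samples as small as the toy lists here.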


Cross-Genre Correlations. We generated ‘subconstruct scores’ for the six writing subconstructs. 
Subconstruct scores were equal to the average of the AWE feature values for the features in each 
subconstruct. Feature values were standardized to have a mean of 0 and a standard deviation of 1 prior 
to averaging. Six subconstruct scores were assigned to each writing sample. We ran cross-genre correlations 
to examine relationships between the subconstruct scores for writing samples in each genre pair 
(e.g., R/IE). Coursework genre pairs had the highest correlations for vocabulary usage (0.35 for P/IE, 
and 0.31 for IE/R), English conventions (0.33 for P/IE and 0.35 for IE/R), and sentence structure (0.30 
for IE/R) subconstructs. Correlations between S and the coursework genres all fell below 0.30. 
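
The scoring and correlation steps described above can be sketched as follows, again with the Python standard library only. Several details are assumptions not stated in the text: features are standardized with the population standard deviation, samples are paired across genres by student, multiple samples from one student in one genre are averaged, and the correlation is Pearson's r. All function names, feature names, and data values are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one feature to mean 0 and standard deviation 1."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def subconstruct_scores(feature_columns):
    """Average the standardized features of one subconstruct, per sample.

    feature_columns maps a feature name to its raw values, one per sample.
    """
    standardized = [zscore(col) for col in feature_columns.values()]
    return [mean(row) for row in zip(*standardized)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def cross_genre_correlation(samples, genre_a, genre_b):
    """Correlate scores for students who wrote in both genres.

    samples is a list of (student_id, genre, score) tuples.
    """
    by_genre = {genre_a: {}, genre_b: {}}
    for student, genre, score in samples:
        if genre in by_genre:
            by_genre[genre].setdefault(student, []).append(score)
    shared = sorted(set(by_genre[genre_a]) & set(by_genre[genre_b]))
    x = [mean(by_genre[genre_a][s]) for s in shared]
    y = [mean(by_genre[genre_b][s]) for s in shared]
    return pearson(x, y)


# Made-up vocabulary features for six writing samples (illustration only).
vocab_features = {
    "word_frequency": [5.1, 4.8, 5.6, 5.0, 4.4, 4.9],
    "word_length": [4.2, 4.6, 4.0, 4.3, 4.9, 4.5],
}
scores = subconstruct_scores(vocab_features)
students = ["s1", "s1", "s2", "s2", "s3", "s3"]
genres = ["P", "IE", "P", "IE", "P", "IE"]
samples = list(zip(students, genres, scores))
print(cross_genre_correlation(samples, "P", "IE"))
```

Pairing by student is what makes the correlation a measure of how consistently the same writers use a subconstruct across two genres. Other pairing or aggregation choices would give different values.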


Implications. Findings from both analyses suggested differences in students’ application of writing 
features across genres. Offering opportunities for students to practice writing in different genres can 
provide instructors and institutions with a more comprehensive picture of students’ writing domain 
knowledge (i.e., writing feature use) and writing achievement. The findings illustrate the limitations 
of the writing domain knowledge observable from single-genre standardized writing assessments. 